Do Text-to-Speech Synthesisers Pronounce Correctly? A Preliminary Study
Abstract
This paper evaluates four commercial text-to-speech synthesisers used by dyslexic people to listen to and proofread text. Two evaluators listened to 704 common English words and judged whether each word was pronounced correctly. When only the words that both evaluators agreed were mispronounced are counted as errors, the proportion of correct pronunciations for the four synthesisers ranges from 98.9% to 99.6% of the 704 words. The evaluators also listened to the same synthesisers speaking phrases containing 44 pairs of homographs and judged whether each instance of a homograph was spoken correctly. Correctness for the four synthesisers ranged from 76.3% to 91.3%.
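The scoring rule described in the abstract can be illustrated with a minimal Python sketch. It assumes that a word counts as an error only when both evaluators marked it as mispronounced, which is one reading of the abstract and not a confirmed description of the authors' procedure. The function name `agreed_correct_rate` and the toy judgements are hypothetical and are not data from the study.

```python
def agreed_correct_rate(judgements_a, judgements_b):
    """Return the share of items NOT marked incorrect by both evaluators.

    Each argument is a list of booleans, one per word:
    True means the evaluator judged the pronunciation correct.
    Assumption: only errors that both evaluators agree on are counted.
    """
    if len(judgements_a) != len(judgements_b):
        raise ValueError("both evaluators must judge the same words")
    if not judgements_a:
        raise ValueError("at least one judgement is required")
    agreed_incorrect = sum(
        1 for a, b in zip(judgements_a, judgements_b) if not a and not b
    )
    return 1 - agreed_incorrect / len(judgements_a)


# Toy data for 8 words (hypothetical, not from the paper).
evaluator_a = [True, False, True, True, True, False, True, True]
evaluator_b = [True, True, True, True, True, False, False, True]

# Only the word at index 5 is marked incorrect by both evaluators.
rate = agreed_correct_rate(evaluator_a, evaluator_b)
print(f"{rate:.1%}")
```

Under this rule, a word flagged by only one evaluator still counts as correct, so the figure is an upper bound compared with counting every flagged word as an error.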
Similar resources
Speech Synthesis of Code-Mixed Text
Most Text to Speech (TTS) systems today assume that the input text is in a single language and is written in the same language that the text needs to be synthesized in. However, in bilingual and multilingual communities, code mixing or code switching occurs in speech, in which speakers switch between languages in the same utterance. Due to the popularity of social media, we now see code-mixing ...
Automatic phonetisation for Icelandic
As a part of my final thesis in language technology, I created a speech synthesiser using the free MBROLA system. MBROLA is a project designed to make speech synthesisers for as many languages as possible available for free. It does not require a lot of technological prowess for the general user to create such a synthesiser: all that is required is segmented speech data, and the rest is handled...
Comparing text-driven and speech-driven visual speech synthesisers
We present a comparison of a text-driven and a speech-driven visual speech synthesiser. Both are trained using the same data and both use the same Active Appearance Model (AAM) to encode and re-synthesise visual speech. Objective quality, measured using correlation, suggests the performance of both approaches is close, but subjective opinion ranks the text-driven approach significantly higher.
On evaluating synthesised visual speech
This paper describes issues relating to the subjective evaluation of synthesised visual speech. Two approaches to synthesis are compared: a text-driven synthesiser and a speech-driven synthesiser. Both synthesisers are trained using the same data and both use the same model for rendering the synthesised visual speech. Naturalness is used as a performance metric, and the naturalness of real visu...
Evolution of Text-to-Speech Systems and Methods of Their Assessment
The paper gives a retrospective of the development of speech synthesis systems, from mechanical synthesisers to computer systems for text-to-speech (TTS) conversion, and analyses the prospects of biomechanical and multimodal TTS systems within dialogue systems that also address higher cognitive levels. Special attention is given to the methods for assessment of the quality of synthesised spe...